Abstract:

The scheduling problem for unit time task systems with arbitrary precedence
constraints is known to be NP-complete. We show that the same is true even
if the precedence constraints are restricted to certain subclasses which make
the corresponding parallel programs more structured. Among these classes
are those derived from hierarchic cobegin-coend programming constructs,
level graph forests, and the parallel or serial composition of an out-tree and
an in-tree. In each case, the completeness proof depends heavily on the
number of processors being part of the problem instances.
1. Introduction


Technological  progress  has  made  it  possible  to  design  and  construct  computer  architectures  with  a  large 
number  of  processors.  The  intention  is  to  make  use  of  the  apparent  mutual  independence  of  many 
activities  in  (sequential  or  parallel)  programs  or  task  systems,  thus  achieving  shorter  overall  execution 
times. Because this hardware-time tradeoff is one of the main justifications for building parallel computers,
the scheduling problem, i.e., the problem of assigning activities to processors so as to respect their inherent
precedence constraints and simultaneously to minimize time, has attracted considerable practical and
theoretical  interest. 

For finite task systems, the scheduling problem could in principle be solved by enumerating all possible
schedules and comparing them. However, in general it does not make sense to invest much more
time in finding a good schedule than this schedule can then save. Unfortunately, it turned out very soon
that even basic variants of the scheduling problem belong to the class of combinatorial optimization
problems which are NP-complete and for which only more or less enumerative solution methods of
exponential complexity are known [4,9,11]. Efficient algorithms which produce optimal schedules and
require only polynomial time are known only for the following few cases:

(i)  the  scheduling  on  an  arbitrary  number  of  identical  processors  of  a  unit-time  task  system  whose 
precedence  constraints  form  an  in-forest  (resp.,  out-forest)  [8]; 

(ii)  the  scheduling  of  an  arbitrary  unit-time  task  system  on  two  identical  processors  [3]; 

(iii)  the  scheduling  on  an  arbitrary  number  of  identical  processors  of  a  unit-time  task  system  whose 
incomparability  graph  is  chordal  [10]. 

For an extended list of complexity results for scheduling problems, see [2,7].

While in [11] arbitrary precedence constraints are used to show NP-completeness of the scheduling
problem of unit-time task systems on an arbitrary number of identical processors, it is the purpose of this
paper to show that even well structured precedence constraints are of no help. In particular, we prove that
precedence constraints as they derive from parallel constructs in programming languages like Concurrent
Pascal or Algol 68 still render the scheduling problem NP-complete. The same holds true for precedence
constraints consisting of forests of level graphs or in-trees and one or more out-trees (or symmetrically,
out-trees and one or more in-trees). In the reductions employed for the proofs, the number of available
processors plays a significant role, an observation in support of the difficulty (or, maybe, impossibility) of
showing NP-completeness for a fixed number of processors.
2. Preliminaries and notation

A task system is a finite set $T = \{t_1, \ldots, t_n\}$ of tasks. For our purposes, all tasks in T require unit time
to get executed. A precedence constraint on a task system T is a partial order $\prec$ over T. The relation
$t_i \prec t_j$ means that the execution of $t_j$ cannot start until $t_i$ is finished. In the sequel, we usually represent
$(T, \prec)$ by a directed acyclic graph (dag) with node set T and an edge from $t_i$ to $t_j$ iff $t_i \prec t_j$ and there
is no $t_k$ such that $t_i \prec t_k \prec t_j$ (i.e., all transitive edges are omitted).

A schedule for $(T, \prec)$ on m processors ($m \in \mathbb{N}$) is a mapping s from T onto some initial segment
$\{1, \ldots, l\}$ of $\mathbb{N}$ such that

(i) $t_i \prec t_j \Rightarrow s(t_i) < s(t_j)$ for all $t_i, t_j \in T$;

(ii) $1 \le |s^{-1}(r)| \le m$ for all $r \in \{1, \ldots, l\}$.

We say that $t \in T$ is executed at (time-)step $s(t)$, and we call l the length of the schedule s. Note that if
$t_i$ and $t_j$ are executed at the same time-step they must be incomparable with respect to $\prec$, i.e., neither
$t_i \prec t_j$ nor $t_j \prec t_i$ holds.
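
As an illustration only, conditions (i) and (ii) can be checked mechanically. The following is a minimal sketch in Python; the function name `is_valid_schedule` and the representation of the dag as a list of edge pairs are assumptions made for this example and do not come from the paper.

```python
from collections import Counter


def is_valid_schedule(tasks, edges, s, m):
    """Check conditions (i) and (ii) for a unit-time schedule.

    tasks: iterable of task names
    edges: iterable of (u, v) pairs of the dag, meaning u precedes v
    s:     dict mapping every task to its time-step (1, 2, ...)
    m:     number of processors
    """
    tasks = list(tasks)
    if set(s) != set(tasks):
        return False  # s must be defined exactly on T
    if not tasks:
        return True
    length = max(s.values())
    load = Counter(s.values())
    # (ii): each step 1..length executes at least one and at most m tasks,
    # so s maps T onto the initial segment {1, ..., length}.
    for step in range(1, length + 1):
        if not 1 <= load[step] <= m:
            return False
    # Reject time-steps outside {1, ..., length} (e.g. 0 or negative values).
    if sum(load[step] for step in range(1, length + 1)) != len(tasks):
        return False
    # (i): checking the dag edges suffices, since "<" on integers is transitive.
    return all(s[u] < s[v] for u, v in edges)
```

For a diamond-shaped dag with edges a→b, a→c, b→d, c→d, the mapping a↦1, b↦2, c↦2, d↦3 is accepted for m = 2 and rejected for m = 1.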

An instance of the scheduling problem for unit-time task systems on an arbitrary number of (identical)
processors is given by a quadruple $(T, \prec, m, l)$ where

(i) T is a finite task system, without loss of generality denoted by the numerals for 1 through |T|;

(ii) $\prec$ is a partial order over T, without loss of generality denoted by a list of the edges in the dag
defined by $\prec$ as noted above;

(iii) m and l are positive integers in radix notation.

Theorem ([11]):

The set

$\{(T, \prec, m, l) \mid$ there is a schedule for $(T, \prec)$ on m processors of length $\le l\}$

is NP-complete.

In the proof of this theorem in [11], task systems with precedence constraints from a completely general
class are used. We want to improve on the above result by showing that precedence constraints of a very
natural structure suffice to make the above scheduling problem NP-complete.

Definition: 

A dag (directed acyclic graph) is a hierarchic parallel graph (HPG) if and only if it can be obtained in a
finite number of steps from the axiom • by the graph grammar with the following rules:

(i) any node • can be replaced by [replacement graph: figure not reproduced];

(ii) any node • can be replaced by [replacement graph: figure not reproduced].


(Note  that  all  edges  point  downward;  thus  all  edges  entering  a  node  which  is  being  replaced,  afterwards 
enter  the  topmost  node  of  the  replacing  graph,  and  correspondingly,  the  edges  which  leave  the  node  then 
become  outgoing  edges  of  the  bottommost  node.) 


We should like to remark that HPG's are closely related to parallel control structures of programs in high
level programming languages like Concurrent Pascal [1] or Algol 68 [12].


Let H be some directed acyclic graph. Then H defines, in an obvious way, a task system $T_H$ (the set
of nodes of H) together with a partial order $\prec_H$ (the partial order over $T_H$ generated by the edges
of H).


Theorem  1: 

The Hierarchic Programs Scheduling Problem

$\mathrm{HPSP} =_{\mathrm{def}} \{(H, m, l);\ H$ is an HPG, and there is a schedule for $(T_H, \prec_H)$ on m processors of length $\le l\}$

is NP-complete.

It is clear that HPSP is in NP. As a matter of fact, it is not hard to test whether a given graph is an HPG,
and then HPSP is a restriction of the general scheduling problem of [11] which is in NP. In the next two
sections we will show that HPSP is NP-complete. This is achieved by efficiently reducing to HPSP the
satisfiability problem 3SAT for sets of clauses with three different literals each [5].
3. A basic task system which is hard to schedule

Let $L = L_1 \wedge \ldots \wedge L_r$ be a propositional formula in three literal conjunctive normal form over the set of
variables $\{x_1, \ldots, x_n\}$, i.e., every clause $L_j$ is a disjunction of three different literals (from three different
variables) in $\{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$.

We first present a directed acyclic graph $H'_L$ which by itself is not an HPG, but consists of 2n HPG
components, and which is hard to schedule. For the time being we assume in addition that the number
m of processors available in the system is not constant but changes in a predetermined manner at every
time-step. We will then show in the next section how to dispose of this assumption, and also, how to
transform $H'_L$ into a hierarchic parallel graph.

Let $H'_L$ be defined as shown in Figure 1 where all edges are considered as directed downward. The graph
$H'_L$ consists of 2n connected components, one for each literal in $\{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$. Each component
has exactly n + 2r + 2 levels. Within each component, every level contains either one or two tasks. The
i-th component has two tasks exactly on level $\lceil i/2 \rceil + 1$ and on all levels n + 2j + 1 such that $1 \le j \le r$
and the literal belonging to component i (which is $x_{\lceil i/2 \rceil}$ if i is odd, and $\bar{x}_{\lceil i/2 \rceil}$ if i is even) does not occur in
$L_j$. Also, within each component every task on level t has (directed) edges going to every task on level
t + 1, for all $1 \le t < n + 2r + 2$, and two tasks on any level are always followed by just one task on
the next level. Obviously, each component forms an HPG.
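
To make the shape of the components concrete, the following Python sketch computes the number of tasks on each level of the i-th component. It is an illustration under stated assumptions, not part of the original construction: clauses are encoded as tuples of nonzero integers (k for $x_k$, -k for $\bar{x}_k$), and the names `component_literal` and `level_sizes` are hypothetical. The level $\lceil i/2 \rceil + 1$ used for the variable is the reading of the definition above that agrees with the processor counts in the proof of Lemma 1.

```python
def component_literal(i):
    """Literal of the i-th component (1-based): x_k for odd i, its negation for even i."""
    var = (i + 1) // 2  # this is ceil(i / 2)
    return var if i % 2 == 1 else -var


def level_sizes(n, clauses, i):
    """Number of tasks on levels 1 .. n+2r+2 of the i-th component of H'_L.

    n:       number of variables
    clauses: list of 3-tuples of nonzero ints (k means x_k, -k its negation)
    """
    r = len(clauses)
    lit = component_literal(i)
    sizes = [1] * (n + 2 * r + 2)  # sizes[t - 1] is the size of level t
    sizes[abs(lit)] = 2            # level ceil(i/2) + 1 has two tasks
    for j, clause in enumerate(clauses, start=1):
        if lit not in clause:      # literal does not occur in L_j
            sizes[n + 2 * j] = 2   # level n + 2j + 1 has two tasks
    return sizes
```

For n = 3 and the single clause $x_1 \vee \bar{x}_2 \vee x_3$, encoded as `(1, -2, 3)`, the component of $x_1$ has two tasks only on level 2, while the component of $\bar{x}_1$ has two tasks on levels 2 and 6.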

The next two lemmas show that there is a schedule for $H'_L$ of length at most n + 2r + 3 if and only if
the formula L is satisfiable.

Lemma 1:

If L is satisfiable then there is a schedule for $(T_{H'_L}, \prec_{H'_L})$ which in every time-step uses at most as many
processors as indicated in Figure 1, and whose length is n + 2r + 3.

Proof: 

Let $V \subseteq \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$ be the set of literals set true under some fixed truth assignment to the
variables $x_1, \ldots, x_n$ that satisfies L, and let $\mathcal{V}$ be the set of those components of $H'_L$ corresponding to
literals in V. Consider the schedule s which, for $1 \le j \le n + 2r + 2$, assigns j to all tasks on level j
in components in $\mathcal{V}$, and j + 1 to all other tasks on level j. We claim that s satisfies the condition in
the lemma. This is certainly true for time-step 1 because $|\mathcal{V}| = n$. In time-step 2, n processors are used
to execute the remaining tasks on level 1, and another n + 1 processors are used to execute all level 2
tasks of the n components in $\mathcal{V}$. This is possible because V contains either $x_1$ or $\bar{x}_1$ but not both. The
same reasoning now applies up through time-step n + 2 after which exactly the n tasks on level n + 2 in
components in $\mathcal{V}$ have been executed. In time-step n + 3, n processors are used to execute the remaining
tasks on level n + 2. Another 2n - 3, 2n - 2, or 2n - 1 processors are used to execute all tasks on level
n + 3 of the components in $\mathcal{V}$, depending on whether 3, 2, or 1 literals of $L_1$ are in V. In the first two
cases, two resp. one of the available 3n - 1 processors remain idle at time-step n + 3. As V contains
the 'true' literals under a satisfying assignment for L it contains at least one literal of $L_1$. Thus, 2n - 1
processors certainly suffice to execute all tasks on level n + 3 of the components in $\mathcal{V}$.

In the next time-step, the remaining tasks on level n + 3 and the n tasks on level n + 4 of components
in $\mathcal{V}$ are executed for which at most 2n + n = 3n processors are needed. Again we may now observe
inductively that after time-step n + 2r + 2 all tasks in level n + 2r + 1 have been executed and there are
exactly those n tasks on level n + 2r + 2 left which are not in components in $\mathcal{V}$. These n tasks can be
scheduled for the n processors available in time-step n + 2r + 3. ∎
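
The counting in this proof can be replayed on a small instance. The sketch below is an illustration only: it tallies, for the schedule s described above, how many tasks run at each time-step, so that the numbers can be compared with those quoted in the proof (n at the first and last step, 2n + 1 at step 2, at most 3n - 1 at step n + 3). The integer encoding of literals and the function name `lemma1_processor_usage` are assumptions of this example; the per-step bounds of Figure 1 are not reproduced here.

```python
def lemma1_processor_usage(n, clauses, assignment):
    """Tasks executed per time-step by the schedule from the proof of Lemma 1.

    n:          number of variables
    clauses:    list of 3-tuples of nonzero ints (k means x_k, -k its negation)
    assignment: dict mapping each variable 1..n to True or False
    Returns a list whose entry t-1 is the number of tasks run at time-step t.
    """
    r = len(clauses)
    num_levels = n + 2 * r + 2
    usage = [0] * (num_levels + 1)      # time-steps 1 .. n + 2r + 3
    for i in range(1, 2 * n + 1):
        var = (i + 1) // 2
        lit = var if i % 2 == 1 else -var
        # Components of true literals run level j at step j, all others at j + 1.
        delay = 0 if assignment[var] == (lit > 0) else 1
        # Levels of this component that carry two tasks.
        wide = {var + 1}
        wide.update(n + 2 * j + 1
                    for j, clause in enumerate(clauses, start=1)
                    if lit not in clause)
        for level in range(1, num_levels + 1):
            usage[level + delay - 1] += 2 if level in wide else 1
    return usage
```

For n = 3, the single clause `(1, -2, 3)` and the satisfying assignment $x_1 = x_2 =$ true, $x_3 =$ false, exactly one literal of the clause is true, and the usage at time-step n + 3 = 6 comes out as n + (2n - 1) = 8.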


Lemma  2: 

If there is a schedule for $(T_{H'_L}, \prec_{H'_L})$ of length at most n + 2r + 3 which at every time-step uses at most
the number of processors indicated in Figure 1, then L is satisfiable.

Proof: 

First observe that any task on level i + 1 can be executed only if all tasks on level i of the same component
have been executed before. As there are 2n components each of which has exactly n + 2r + 2 levels and
as there are only n processors available at the first step, every admissible schedule for $(T_{H'_L}, \prec_{H'_L})$ has
a length at least n + 2r + 3. Further, as there also are only n processors available in the last step every
admissible schedule s for $(T_{H'_L}, \prec_{H'_L})$ of length n + 2r + 3 satisfies the following property:

There is a set $\mathcal{V}$ of exactly n components of $H'_L$ such that for all j with $1 \le j \le n + 2r + 2$,

(I) under s all tasks on level j of components in $\mathcal{V}$ are executed at time-step j, and all tasks
on level j of components not in $\mathcal{V}$ are executed at time-step j + 1.

Let V be the set of literals belonging to the components in $\mathcal{V}$. We are now going to show that V defines
a satisfying truth assignment for L via $x_i :=$ true iff $x_i \in V$, for $1 \le i \le n$. Assume first that there
is some minimal i, $1 \le i \le n$, such that V contains $x_i$ and $\bar{x}_i$. Then n + 2 processors are needed in
time-step i + 1 to execute all tasks on level i + 1 of the components in $\mathcal{V}$, and only n (resp. n - 1, if
i = 1) processors are left to complete the execution of level i of $H'_L$. As i was chosen minimal, there
are, however, n + 1 (resp. n, if i = 1) tasks left on level i. Hence, V must contain, for every i, either
$x_i$ or $\bar{x}_i$.

Next assume that there is some minimal j, $1 \le j \le r$, such that V contains no literal occurring in
$L_j$. Then 2n processors are needed in time-step n + 2j + 1 to execute all tasks on level n + 2j + 1 of
the components in $\mathcal{V}$, and only n - 1 processors are available to execute the remaining n tasks of level
n + 2j, in contradiction to property (I). Hence, V must contain, for every j, at least one literal in $L_j$, i.e.,
V gives rise, in the way indicated above, to a satisfying truth assignment for L. ∎
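
The first contradiction in this proof is a short piece of arithmetic, written out here as a worked example for $i \ge 2$. The bound of 2n + 2 processors at time-step i + 1 is not stated explicitly in the text; it is inferred from the proof's own statement that n processors remain once n + 2 are in use, so it should be read as an assumption about Figure 1.

```latex
\begin{align*}
\text{needed for level } i+1 \text{ in } \mathcal{V}
   &= (n-2)\cdot 1 + 2\cdot 2 = n+2
   && \text{(both } x_i \text{ and } \bar{x}_i \text{ contribute two tasks)}\\
\text{left for level } i
   &= (2n+2) - (n+2) = n\\
\text{still unexecuted on level } i
   &= (n-1)\cdot 1 + 1\cdot 2 = n+1 > n
   && \text{(exactly one of } x_{i-1}, \bar{x}_{i-1} \text{ is outside } \mathcal{V}\text{)}
\end{align*}
```

The last line uses the minimality of i: for the variable $x_{i-1}$ exactly one of the two components lies outside $\mathcal{V}$, and it has two tasks on level i.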

In the next section, we shall show how to embed $H'_L$ in a hierarchic parallel graph in such a way that
at  each  time-step  exactly  the  proper  number  of  processors  is  available  for  the  tasks  in  the  embedded 
subgraph. 


4.  HPG’s  are  hard  to  schedule 
In  this  section,  we  prove  our  main 
Theorem  2: 

HPSP is NP-complete.

Proof: 

Let L and $H'_L$ be as in the previous section. We now define the instance $(H_L, m, l)$ of the Hierarchic
Programs Scheduling Problem with $H_L$ as in Figure 2, m = 3n + 1, and l = n + 2r + 8.

$H_L$ has n + 2r + 8 levels. Note that every directed path from the topmost to the bottommost node of
$H_L$ which travels along the left part of $H_L$ in Figure 2 contains n + 2r + 8 nodes. As a consequence,
every schedule of length $\le l$ has in fact length = l and must execute these tasks level by level. The
construction of $H_L$ thus assures that for all time-steps i + 3 with $1 \le i \le n + 2r + 3$ the number of
processors available for the right part of $H_L$ in Figure 2 (which is $H'_L$) is exactly the same as for $H'_L$ in
the previous section at time-step i.

$H_L$ obviously is an HPG and can be constructed from L in polynomial time (though we omit the details
of this construction). This establishes, together with Lemmas 1 and 2, the claim of the theorem. ∎

We should like to mention that HPSP still remains NP-complete if the size of m and l in the instances
is taken from their unary representation.
5. Level graphs and forests


In  this  section,  we  extend  the  result  of  the  previous  section  to  a  class  of  seemingly  very  simple  precedence 
constraints. 


Definition: 

A directed acyclic graph H is a level graph iff its node set $T_H$ can be partitioned into sets $T_1, \ldots, T_s$
such that, for all $1 \le i < s$, there is an edge from every node in $T_i$ to every node in $T_{i+1}$.

A  level  forest  is  a  directed  acyclic  graph  consisting  of  finitely  many  level  graph  components. 
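
As a small illustration of this definition, the edge set of a level graph is determined by its partition into levels. The Python sketch below builds it; the function name `level_graph_edges` and the list-of-lists input format are assumptions of this example.

```python
from itertools import product


def level_graph_edges(levels):
    """Edges of the level graph with the given partition T_1, ..., T_s.

    levels: list of lists of node names, levels[i - 1] playing the role of T_i
    Returns the set of edges (u, v) with u in T_i and v in T_{i+1}.
    """
    edges = set()
    for upper, lower in zip(levels, levels[1:]):
        # every node of T_i gets an edge to every node of T_{i+1}
        edges.update(product(upper, lower))
    return edges
```

A level forest is then simply a disjoint union of such graphs, for example one call of `level_graph_edges` per component.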


Note that every component of $H'_L$ in Section 3 is a level graph, and hence, that $H'_L$ is a level forest.

Theorem  3: 

The  scheduling  problem  with  an  arbitrary  number  of  identical  processors  and  unit-time  task  systems  with 
level  forests  as  precedence  constraints  is  NP-complete. 

Proof: 

We also use a reduction of 3SAT to the above problem. We noted already that $H'_L$ is a level forest. Now
Figure 3 provides an embedding of $H'_L$ into a level forest graph $\bar{H}_L$ which by the same argument as in
the proof of Theorem 2 has a schedule on m = 3n + 1 processors which is of length $l \le n + 2r + 3$ if
and only if L is satisfiable. ∎
6. In-forests with one out-tree


While trees (in-trees or out-trees) were the first class of precedence constraints for which a polynomial
time scheduling algorithm was found [8] (a result which easily generalizes to in-forests and out-forests)
we shall show in this section that already the simplest combination of the two kinds of trees makes the
scheduling problem hard.

Let MF (mixed forest) be the class of directed acyclic graphs each of whose components is either an
in-tree or an out-tree, and let 2MF be the subclass of MF whose members have at most two components.

Theorem  4: 

The scheduling problem with an arbitrary number of identical processors, unit-time task systems, and
with elements of 2MF as precedence constraints is NP-complete.

Corollary: 

The scheduling problem for MF-graphs is NP-complete.

Proof  of  the  Theorem: 

A variant of 3SAT which is also NP-complete is One-in-three-3SAT, i.e., the problem to determine for
an arbitrary propositional formula L in 3-conjunctive normal form whether there is a satisfying truth
assignment to the variables in L such that, in every clause $L_j$, exactly one literal is assigned true [5]. For a
given $L = L_1 \wedge \ldots \wedge L_r$ with variables $x_1, \ldots, x_n$, we construct $\tilde{H}_L$ as indicated in Figure 4. $\tilde{H}_L$ consists
of one in-tree and one out-tree (again all edges are considered directed downward).
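
For readers who want to experiment with small instances, One-in-three-3SAT can be decided by exhaustive search. The sketch below is exponential in n and serves only to pin down the definition; the integer encoding of literals (k for $x_k$, -k for its negation) and the function name `one_in_three_assignments` are assumptions of this example.

```python
from itertools import product


def one_in_three_assignments(n, clauses):
    """Yield every truth assignment making exactly one literal per clause true.

    n:       number of variables x_1 .. x_n
    clauses: list of 3-tuples of nonzero ints (k means x_k, -k its negation)
    Each assignment is returned as a dict {variable: bool}.
    """
    for bits in product([False, True], repeat=n):
        def is_true(lit):
            # a positive literal is true iff its variable is true
            return bits[abs(lit) - 1] == (lit > 0)

        if all(sum(is_true(lit) for lit in clause) == 1 for clause in clauses):
            yield {var + 1: bits[var] for var in range(n)}
```

For the single clause `(1, 2, 3)` there are exactly three such assignments, whereas adding the clause `(-1, -2, -3)` leaves none.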

Further, let $\bar{\bar{H}}_L$ be $\tilde{H}_L$ without its level 2n + 2r + 4 nodes and the incident edges. $\bar{\bar{H}}_L$ consists of 2n
connected components of 2n + 2r + 2 levels each (these in-tree components are called r-components)
and two components of 2n + 2r + 3 levels (called l-components), one in-tree and one out-tree. Each
r-component contains, on every level, either one or two tasks, and the i-th r-component (which belongs
to $x_{\lceil i/2 \rceil}$ if i is odd, and $\bar{x}_{\lceil i/2 \rceil}$ if i is even) has two tasks on levels $2\lceil i/2 \rceil$ and 2j + 1 for all $j \ne \lceil i/2 \rceil$,
$1 \le j \le n$, as well as on level 2n + 2j (resp., 2n + 2j + 1) if the corresponding literal does (resp., does
not) occur in $L_j$.
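
The level structure of an r-component can be tabulated in the same way as for the components of Section 3. The following Python sketch is an illustration under assumptions: literals are encoded as nonzero integers (k for $x_k$, -k for its negation), the function name `r_component_sizes` is hypothetical, and the level numbers follow the reading of the description given above, which was chosen to agree with the processor counts used later in the proof.

```python
def r_component_sizes(n, clauses, i):
    """Number of tasks on levels 1 .. 2n+2r+2 of the i-th r-component.

    n:       number of variables
    clauses: list of 3-tuples of nonzero ints (k means x_k, -k its negation)
    i:       index of the r-component, 1 <= i <= 2n
    """
    r = len(clauses)
    var = (i + 1) // 2                       # ceil(i / 2)
    lit = var if i % 2 == 1 else -var
    sizes = [1] * (2 * n + 2 * r + 2)        # sizes[t - 1] is the size of level t
    sizes[2 * var - 1] = 2                   # level 2 * ceil(i/2)
    for j in range(1, n + 1):
        if j != var:
            sizes[2 * j] = 2                 # level 2j + 1 for every other variable
    for j, clause in enumerate(clauses, start=1):
        # level 2n + 2j if the literal occurs in L_j, level 2n + 2j + 1 otherwise
        level = 2 * n + 2 * j if lit in clause else 2 * n + 2 * j + 1
        sizes[level - 1] = 2
    return sizes
```

For n = 3 and the clause `(1, -2, 3)`, the r-component of $x_1$ has two tasks on levels 2, 5, 7 and 8, and the r-component of $\bar{x}_1$ has two tasks on levels 2, 5, 7 and 9.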

We now show that there is a schedule for $(T_{\bar{\bar{H}}_L}, \prec_{\bar{\bar{H}}_L})$ on m = 3n + 3 processors of length at most
2n + 2r + 3 if and only if L is in One-in-three-3SAT.

First, let $V \subseteq \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$ be the set of literals set true under some fixed truth assignment to
the variables $x_1, \ldots, x_n$ such that V contains exactly one literal of every clause $L_j$ of L, and let $\mathcal{V}$ be
the set of those components of $\bar{\bar{H}}_L$ determined by the literals in V. Consider the schedule s which, for
all $1 \le j \le 2n + 2r + 2$, assigns j to all tasks on level j of all the r-components in $\mathcal{V}$ and the two
l-components, and j + 1 to all tasks on level j of all the other r-components. Level 2n + 2r + 3 of the
l-components is assigned 2n + 2r + 3 under s. We leave it to the reader to verify that, in fact, s is a
correct schedule for $(T_{\bar{\bar{H}}_L}, \prec_{\bar{\bar{H}}_L})$.

[Figure 4: The mixed forest $\tilde{H}_L$ for an example formula with clauses $L_1$, $L_2$, ..., $L_r$.]

For the other direction assume that there is a schedule s for $(T_{\bar{\bar{H}}_L}, \prec_{\bar{\bar{H}}_L})$ on m = 3n + 3 processors,
of length $\le 2n + 2r + 3$. As the l-components of $\bar{\bar{H}}_L$ consist of 2n + 2r + 3 levels, we must in fact
have the length of s equal 2n + 2r + 3. Now note that also because of the l-components, in the first and
last time-step at most n processors are available for the 2n r-components. As these components all have
2n + 2r + 2 levels the following property must hold:

(II) There is a set $\mathcal{V}$ of exactly n r-components in $\bar{\bar{H}}_L$ such that, for all j with $1 \le j \le 2n + 2r + 2$,
under s all tasks on level j of components in $\mathcal{V}$ are executed at or prior to
time-step j, and in every r-component not in $\mathcal{V}$, there is at least one task on level j not yet
executed after time-step j (i.e., its value under s is > j). Furthermore, all tasks on level
j are executed at the latest at time-step j + 1.

Let V be the set of literals belonging to the r-components in $\mathcal{V}$. We shall show that V defines, as in
Lemma 2, a truth assignment satisfying L, and also that V contains exactly one literal of every clause $L_j$
of L.

As 2n + 3 processors are needed to execute the level 1 tasks in the two l-components, and because of
property (II), all tasks executed at time-step 1 are of level 1. Let $\mathcal{V}$ be the set of those r-components
whose level 1 tasks are executed in the first step, and let V be defined as above.

Assume first that there is some minimal i such that V contains either both $x_i$ and $\bar{x}_i$, or none of the two
literals. It easily follows from the construction of $\bar{\bar{H}}_L$ and property (II) that after time-step 2i - 1

a) all levels j with $1 \le j < 2i - 1$ are completed,

b) all tasks on level 2i - 1 of components in $\mathcal{V}$ have been executed, and

c) no other tasks have been executed so far.

At time-step 2i, three (resp., n + 2 if i = 1) processors are needed to execute all tasks on level 2i of
the l-components, and 2n - 1 (resp., n if i = 1) processors have to be used to complete level 2i - 1.
Therefore, n + 1 processors are available for tasks which are on levels $\ge 2i$ and executable at time-step
2i. If V contains both $x_i$ and $\bar{x}_i$, these n + 1 processors do not suffice to execute the n + 2 tasks on level
2i in the components in $\mathcal{V}$, contradicting property (II). If V contains neither $x_i$ nor $\bar{x}_i$, then n processors
suffice to execute all tasks on level 2i of the components in $\mathcal{V}$, and the one remaining processor could
be used for any task on some level > 2i all of whose predecessors have already been executed. Let us
assume instead that one processor is added to the m processors available at time-step 2i + 1. It follows
from the construction of $\bar{\bar{H}}_L$, however, that in this case 3n + 5 processors are necessary at time-step
2i + 1 to assure property (II). Thus, we again obtain a contradiction, and we conclude that, for all i, V
must contain either $x_i$ or $\bar{x}_i$. Furthermore, a simple counting argument shows that, under s, in time-step
j, where $1 \le j \le 2n + 1$, only tasks on levels j or j - 1 are executed.

Now assume that there is some minimal j, $1 \le j \le r$, such that V does not contain exactly one literal
of $L_j$. Then, by an argument analogous to the one just presented, we achieve a contradiction to property
(II) at time-step 2n + 2j if V contains more than one literal of $L_j$, and at time-step 2n + 2j + 1 if V
contains no literal of $L_j$ at all. Hence, V provides a truth assignment for $x_1, \ldots, x_n$ showing that L is
in One-in-three-3SAT.

It is now immediate that there is a schedule for $(T_{\tilde{H}_L}, \prec_{\tilde{H}_L})$ on m = 3n + 3 processors of length
$l \le 2n + 2r + 4$ if and only if L is a member of One-in-three-3SAT. Again we leave it to the reader to
verify that the above reduction can be carried out in polynomial time. ∎

The  result  stated  in  Theorem  4  has  independently  been  obtained  in  [6]. 

As a further corollary of Theorem 4 and the construction of $\tilde{H}_L$ we obtain that the scheduling problem for
precedence constraints decomposable into an out-tree and an in-tree opposing each other is NP-complete.
This follows immediately if we add to $\tilde{H}_L$ one node (at the top) with outgoing edges to all nodes in
$\tilde{H}_L$ without predecessor, and a second node (at the bottom) with incoming edges from all nodes of $\tilde{H}_L$
without successor.
7. Conclusion


There  are  several  conclusions  we  should  like  to  point  at  which  can  be  drawn  from  the  results  presented 
in the previous sections. The first is that restricting the precedence constraints to be either in-forests
or  out-forests  allows  a  polynomial  scheduling  algorithm,  but  that  relinquishing  this  restriction  slightly  in 
either  one  of  a  number  of  directions  immediately  renders  the  scheduling  problem  NP-complete.  We  have 
shown  this  to  hold,  for  example,  for  the  parallel  composition  of  an  out-tree  and  an  in-tree  as  well  as  for 
their  serial,  opposing  composition.  The  latter  might  seem  a  little  bit  surprising  in  view  of  the  polynomial 
scheduling  algorithms  for  in-  and  out-trees,  respectively.  But  it  is  the  intricate  interleaving  of  the  two 
trees  on  different  levels  which  makes  them  so  difficult  to  schedule  together. 

We also showed that restricting the precedence constraints to a subclass which is widely considered
well-structured and which forms a subset of the precedence constraints originating from parallel constructs
in high level programming languages does not help: this subclass is, in a sense, as hard to schedule
as the general class. Again the nicely structured precedence constraints still allow the encoding of an
NP-complete combinatorial problem.

The  last  observation  is  that  in  all  the  reductions  given  in  the  previous  sections,  the  number  of  parallel 
processors  is  part  of  the  problem  instance,  and  that  this  fact  is  heavily  made  use  of.  This  once  more 
supports  the  conjecture  that  it  might  not  be  possible  to  prove  the  scheduling  problem  on  some  fixed 
number  of  processors  to  be  NP-complete. 